# Large-scale Image Understanding

Vit So400m Patch14 Siglip 378.webli

A Vision Transformer (ViT) model based on SigLIP. It contains only the image encoder and uses the original attention pooling mechanism.

Image Classification
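
The attention pooling mentioned in the SigLIP entry above can be illustrated with a minimal sketch. It assumes a single attention head and one query vector, and it omits the learned projections, normalization, and MLP that real implementations typically add. The function name `attention_pool` and the toy values are hypothetical, chosen for illustration only, and are not taken from the model's code.

```python
import math


def attention_pool(tokens, query):
    """Pool N token vectors (each of length D) into one vector of length D.

    tokens: list of N vectors, e.g. patch embeddings from the image encoder
    query:  one vector of length D (a learned probe in a real model)
    """
    d = len(query)
    # Scaled dot-product score between the query and every token.
    scores = [
        sum(q * t for q, t in zip(query, tok)) / math.sqrt(d) for tok in tokens
    ]
    # Numerically stable softmax over the N scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted sum of the tokens is the pooled image embedding.
    pooled = [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(d)]
    return pooled, weights


# Toy input: 3 patch tokens of dimension 2 (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 0.0]  # a zero query gives uniform weights, i.e. plain mean pooling
pooled, weights = attention_pool(tokens, query)
```

With a non-zero query, tokens that align with the query receive larger weights, so the pooled vector emphasizes those patches rather than averaging all of them equally.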

Vit Large Patch14 Clip 224.laion2b

A Vision Transformer (ViT) model based on the CLIP architecture, specialized for image feature extraction.

Image Classification
